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Abstract. The LISA Pathfinder data analysis team has been developing in the last years the 
infrastructure and methods required to run the mission during flight operations. These are 
gathered in the LTPDA toolbox, an object oriented MATLAB toolbox that allows all the data 
analysis functionalities for the mission, while storing the history of all operations performed to 
the data, thus easing traceability and reproducibility of the analysis. The parameter estimation 
methods in the toolbox have been applied recently to data sets generated with the OSE (Off-line 
Simulations Environment), a detailed LISA Pathfinder non- linear simulator that will serve as a 
reference simulator during mission operations. These simulations, so called operational exercises, 
are the last verification step before translating these experiments into tele-command sequences 
for the spacecraft, producing therefore very relevant datasets to test our data analysis methods. 
In this contribution we report the results obtained with three different parameter estimation 
methods during one of these operational exercises. 



1. Introduction 

The main objective of the LISA Pathfinder (LPF) mission 2\ is to characterise the purity 
of the free fah of two test masses in nominal geodesic motion in its orbit around the Lagrange 
point LI. The main scientific objective of the mission is specified in a spectral density of relative 
acceleration fluctuations between test masses of SAa < 3 X 10-i^ms-2Hz-V2 at ImHz. Such 
an unprecedented metrology sensitivity in a space mission would be considered as a technology 
achievement sound enough to guarantee a successful development of a future gravitational wave 
observatory in space. The most mature concept for this space gravitational wave mission is 
LISA ^3J, which was recently abandoned as a joint ESA/NASA effort, although the concept is 
still being actively pursued independently by both space agencies. 
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An important aspect of the LISA Pathfinder mission is the need of a flexible data analysis 
infrastructure able to help the taking of decisions since, during operations, the LPF must be 
understood as an in-flight experiment with hundreds of channels being sampled and a tight 
schedule of experiments to be executed sequentially. 

The LISA Pathfinder data analysis team has been developing in the last years the 
infrastructure and methods required to run the mission during flight operations. These are 
gathered in the LTPDA toolbox [4J, an object oriented MATLAB toolbox that allows all the 
data analysis functionalities for the mission, while storing the history of all operations performed 
to the data, thus easing traceability and reproducibility of the analysis. 

In this contribution we focus on three parameter estimation methods — time-domain linear 
estimation, time-domain non-linear estimation and Markov Chain Monte Carlo in frequency 
domain — implemented in the toolbox and the comparison of their results when applied with 
simulated data. The aim of the team when performing this exercise was twofold: to compare the 
consistency between methods and to test themselves against the most realistic data available. 

2. The OSE simulator and the sixth operational exercise 

The OSE (Off-line Simulations Environment) is a detailed LISA Pathfinder non-linear simulator 
that will serve as a reference simulator during mission operations. This simulator is the one being 
used during a series of runs being regularly executed, the so called operational exercises. These 
exercises with simulated data aim at testing the on-orbit experiments in a realistic environment 
in terms of software and time constraints. The operational exercises are the last verification 
step before translating experiments into tele-command sequences for the spacecraft, producing 
therefore very relevant datasets to test our data analysis methods. 

The data being analysed here correspond to the sixth operational exercise, which was run 
on April 2011. The simulated data correspond to two runs — called investigations — of around 
8 hours duration. In each of these a sequence of sinusoids of different frequency is injected in 
the first and the second channel, for the first and the second investigation respectively. These 
two runs excite the system making use of the two only input channels that we consider in the 
current model, however the real LTP will count with many other excitation mechanism, like 
the application of direct forces to the test mass, and hence will require a longer sequence of 
investigations to fully characterise its dynamics. 

3. Model description 

The model of the LTP considered here for the analysis is an improved version of the one 
previously used in previous works [5], which includes actuators and delays. This same model has 
been considered for similar studies in [GJE]. In the current analysis, we describe the measurement 
of an experiment in the LTP in the frequency domain as 

o = G X Oin + n, (1) 

where o stands for the measured interferometer displacement; Oin are the injected test signals 
and n the noise vector. All of them two components vectors, one component for each of 
the measurement channels considered here, the first one measuring the distance between the 
spacecraft and the first test mass, Ox^i, and the second one measuring the distance between the 
two test masses, Ox,i2 • 

0=f ), 0,n-( "-'^ ), ). (2) 

The transfer functions describing input to output relation in Eq.Q are contained in the 
matrix G, 

G = (D X (DTifo X S)-i + Cioop)-' X Cioop x DTi„ (3) 




Figure 1. Box diagram of the model considered in the current analysis. Boxes are systems, 
rhombus represent injection points and triangles stand for cross-couplings. Notation and symbols 
are explained in the text. 



with 

Cioop = C X A X H (4) 

where we have split the control box, Cioop, in three subsystems: inertia decoupling matrix, C; a 
matrix with the Drag-Free and Attitude Control Subsystem (DFACS) controllers expressed as 
transfer functions, H; and the matrix A with actuator gain and delays. These matrices can be 
expressed as 

/I 1 \ 



C = , (5) 

where Hdf(cc;) and Hsus(cc;) are (known) transfer function in frequency domain. And, 

Aie-- \ 
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where Ai and A2 are actuator gains for the first — the Field-emission electric propulsion 
(FEEP) — and the second — the electrostatic capacitive actuators — channel, ri and T2 are 
delays simulating actuators responses. Default values are ti = 0.2 s and T2 = s. 



The dynamics of the test mass are contained in the fohowing matrix 



where we define the first and second test mass masses, mi and 7712; the spacecraft mass, msc'-, 
first and differential stiffness, cji and cjui^ ^2 ~ ^1)5 gravitational cross-coupling between 
both test masses, T^x- Iii this model, the interferometer is represented by a sensing matrix, S, 
translating physical test mass displacements into interferometric read-out, 

t ) 

The four elements are interferometric calibration factors. The non-diagonal are the possible 
cross-couplings mixing both channels. Processing of interferometer data introduces a delay of 
0.4 s on both channels which is indicated by DTjfo in the model — see figure [T] We consider 
as well a second delay coming from the guidance inputs, this could be useful to compensate for 
a lack of information on the true time that guidance command is applied inside the DFACS 
software. Guidance delays are indicated by DTin block in the model. Mathematically, both 
blocks are equivalent: 

/ -s-ATifo,i \ / p-s-ATin,i f) \ 

DTifo^l^ ^-s.AT,^,y DTi„=(^ Q ,-..AT.„, j- (10) 

The model is implemented in the LTPDA toolbox as an analytical expression for each of the 
components. In future version, this model will be implemented as an state-space model in order 
to enhance its integration with the rest of the models of the toolbox. 



4. Description of the parameter estimation methods 

In the following we provide a summary of the three methods used in the analysis of the sixth 
operational exercise. More details for each method can be found in the interleaved references. 



Linear Fit The procedure for this method is based on a model linearisation to first order. The 
method performs a time domain parameter estimation. The different steps of the fit process are 
summarized as follow [8j: 

(i) Data for all investigations and channels are whitened. The whitening filter is obtained 
through the inverse of the power spectrum for a data segment where no signal is injected, 
i.e. where we are truly evaluating the noise performance of the instrument. 

(ii) Model response to current parameters values is calculated and whitened. This is then 
subtracted from the data. 

(iii) A first order expansion (in terms of the parameters) of the model is calculated and whitened. 
The set of different series corresponding to different parameters represent the fit basis. 

(iv) Each experiment can provide only a limited amount of information, therefore some 
parameters will be undetermined causing an impair of fit capability. Fit basis is then 
changed for each experiment by singular value decomposition (SVD) which provides a new 
fit basis in terms of a linear combination of the elements of the old (physical) basis. The 
corresponding parameters are linear combination of the original fit parameters. 

(v) SVD parameters are estimated for each experiment separately by the solution of the 
corresponding system of normal equations. 



Table 1. Results of the parameter estimation methods for the sixth operational exercise. 



Parameter 


Linear 

X ±a 




Non-linear 

X ±a 




MCMC 

X ± a 




Al 


1.0699 ± 0.0005 




1.0705 ±0.0006 




1.0694 ±0.0003 




A2 


0.99998 ± 0.00003 




0.99998 ± 0.00003 




0.99996 ± 0.00002 




S21 


(1.2 ±0.4) X 10-6 




(1.2 ±0.4) X 10-6 




(1.9 ±0.3) X 10-6 






-0.1982 ± 0.0005 




-0.1985 ± 0.0005 




-0.1998 ±0.0002 






-0.199 ±0.001 




-0.199 ±0.001 




-0.199 ±0.001 




^2 


(-1.319 ±0.002) X 


10-6 


(-1.319 ±0.002) X 


10-6 


(-1.319 ±0.002) X 


10-6 




(-7.160 ±0.006) X 


10-7 


(-7.160 ±0.006) X 


10-7 


(-7.150 ±0.005) X 


10-7 



(vi) Once the values of all SVD parameters is known with the corresponding uncertainties. The 
results for the different experiments are collected together and the system is inverted in 
order to come back to the desired fit parameters. 

(vii) New values for fit parameters are then updated and the procedure can start again in a loop 
until convergence is reached. If the ratio between squared difference of two consecutive 
parameters estimation and parameters variance is less than 1, then convergence is assumed 
and the loop is stopped. Parameters which have demonstrated minimum value for mean 
squared error during the loop are selected for the output. 

Non-linear fit In this case, the method implements a non-linear parameter estimation in time 
domain. The key features of the algorithm are [6J: 

(i) The joint test statistic, in sense, is computed on all available investigations and channels. 

(ii) Data are decorrelated by applying whitening filters previously estimated on noisy data 
stretches. 

(iii) To ease the search, parameters are bounded with lower and upper limits. 

(iv) The optimization starts from an initial guess by using a derivative- free algorithm (simplex). 
Eventually, if the minimum is too far away from the optimum, a preliminary large-scale 
gradient-based algorithm (BFGS Quasi-Newton) is also apphed. 

(v) Estimated 1-a error are obtained by inverting the Fisher matrix of analytical first-order 
derivatives. 

Markov Chain Monte Carlo (MCMC) The MCMC method implemented in LTPDA performs 
a Monte Carlo integration of the log-likelihood function computed in frequency domain. In 
summary, the algorithn can be split in following steps [5j : 

(i) The Fast Fourier transform is applied to segments where signals are injected and the power 
spectrum is computed to evaluate the noise level in a signal free segment. 

(ii) The expected covariance matrix of the parameters is computed as the inverse of the Fisher 
matrix. 

(iii) The method can optionally look for the maximum likelihood parameters using the simplex 
algorithm. 

(iv) Sample the log-likehood surface with the Metropolis algorithm. Initially, the method applies 
an annealing to guarantee a correct exploration of the parameter space. There is a cooling 
down phase before starting the integration of the likelihood surface around the likelihood 



Table 2. Results of the Kolmogorov - Smirnov test for the sixth operational exercise. 



Investigation 


Method 


Test 


P- Value 


Test 


Critical 








Result 




Statistic 


Value @95% 


±liV . 


1 n 1 


Linear 


Reject 


C\ C\C\ A C\ 

U.UU49 


O.Ooz 


0.041 


Inv. 




Non-Linear 


Reject 


0.0046 


0.052 


0.041 


Inv. 




MCMC 


Reject 


0.0048 


0.052 


0.041 


Inv. 




Linear 


Not Reject 


0.6480 


0.022 


0.041 


Inv. 




Non-Linear 


Not Reject 


0.6480 


0.022 


0.041 


Inv. 


1 Ox;V2 


MCMC 


Not Reject 


0.6543 


0.022 


0.041 


Inv. 


1 


Linear 


Not Reject 


0.9801 


0.014 


0.041 


Inv. 


2 Ox;v 


Non-Linear 


Not Reject 


0.9801 


0.014 


0.041 


Inv. 


2 Oa;^i 


MCMC 


Not Reject 


0.9801 


0.014 


0.041 


Inv. 


2 Oa;^i2 


Linear 


Not Reject 


0.4631 


0.026 


0.041 


Inv. 


2 Oa;^i2 


Non-Linear 


Not Reject 


0.4478 


0.026 


0.041 


Inv. 


2 Oa;^i2 


MCMC 


Not Reject 


0.4478 


0.026 


0.041 



maxima. During this phase, the covariance matrix which defines the proposal distribution 
is rescaled in some samples to improve convergence. The user sets all parameters that define 
these phases. 

(v) Sample the log-likehood surface around the maxima obtained in the previous phase until 
the chain reaches the number of samples set by the user. 

5. Parameter estimation methods comparison 

The previous methods were applied to the two investigations of the sixth operational exercise. 
Our Fisher matrix analysis showed that, given the two investigations available, we are only 
able to estimate 7 parameters independently. We selected the subset shown in table [TJ which 
produces a full rank Fisher matrix. In the table, the results for the three parameter estimation 
methods show a good agreement for the numerical value as well as for the estimated error on 
the parameter. Although not shown here, this uncertainty agrees with the one computed from 
the Fisher matrix analysis — more details on the Fisher matrix computation can be found in [5]. 

Although the methods seem in good agreement we developed some tools to quantify 
the agreement and check if the different parameter estimation methods provide us different 
information about the data being analysed. We discuss this in the next section. 

6. Statistical goodness-of-fit tests 

We test accuracy of our parameter estimation methods by statistical tests on residuals. Fit 
residuals are obtained subtracting model response from the measured signal. If the model and 
the fit process are accurate, then we expect to remove the deterministic signal from our measured 
signal. Therefore the residuals are expected to be equivalent to the noise output of the system 
— figures [2] and [3] show the response predicted for each model and the residuals for the two 
investigations under analysis, respectively. 

The main idea behind the test is that subtracting the estimated from the measured response 
should retrieve a noise curve which is statistically equivalent to the noise level of the system 
evaluated previously to the signal injection. There are two main limitation to such kind of test: 
i) The noise out of the system should be stationary and ii) A noise only time series must be 
available for comparison. The test we construct is based on the comparison of residuals and noise 
power spectral densities. In order to build a test for power spectral densities we should in general 




Figure 2. Investigation 1 in the 6th operational exercise. Top to bottom for channel Oxi (left) 
and Oxi2 (right): Comparison in time domain of the measured output against output predicted 
by the models resulting from each method (top), residuals in time domain (middle) and residuals 
in frequency domain compared to independent noise level estimate (bottom). 



^ Time origin: 2010-04-23 10:08:48.000 UTC 
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Figure 3. Investigation 2 in the 6th operational exercise. Top to bottom for channel Oxi (left) 
and Oxi2 (right): Comparison in time domain of the measured output against output predicted 
by the models resulting from each method (top), residuals in time domain (middle) and residuals 
in frequency domain compared to independent noise level estimate (bottom). 



face the problem of the statistical properties of the spectrum, which, in the case of Welch overlap 
windowed averaged periodogram, can be highly complex [9] and annoyingly dependent on data 
underlying distribution. We overcome the problem with the Kolmogorv-Smirnov test (KS-test) 
whose statistic is not dependent form the statistical properties of the data under test — a more 
detailed description of this method can be found in [7J. 

The null hypothesis for the test is to consider that residuals and noise are two random 
processes with the same underlying distribution, i.e. residuals and noise are two realizations 
of the same physical process. The null hypothesis is rejected or not on the basis of a given 
significance level which is fixed to 5% in the present case. Null hypothesis is rejected if the 
probability associated with the KS-statistic of the two data series is less than the required 
significance level. Results of the KS-test for our methods are reported in table [2j The test 
provides the same results for the three methods. Spectra of fit residuals for the first channel 
on the first investigation are not compatible with expected noise spectra, which shows that 
our methods are not able to completely remove the injected signal in the first channel for this 
investigation. This effect, that could show a mismatch between the model being used for the 
analysis and the one used to generate the data, is currently being investigated. In the other 3 
experiments the subtraction works effectively and the residuals spectra are compatible with the 
expected noise spectra. 

7. Summary and conclusions 

In this contribution we have presented the comparison between three different parameter 
estimation methods implemented in LTPDA applied to the the analysis of a set of data generated 
by a complex LTP simulator, the OSE. Results show that the three methods are in agreement 
when we perform the analysis of the two investigations in the sixth LTP operational exercise. 

We also presented a statistical analysis of the residuals obtained by each method when 
subtracting the expected response of the system to the measured one. All methods agree as 
well under this framework, however the Kolmogorov-Smirnov analysis show that the methods 
are not able to completely subtract the injected signal from the measured output in the first 
channel of the first investigation, i.e. the experiment where we inject a sequence of sinusoids 
in the first channel. This effect, enhanced by the high signal-to-noise ratio of the signal in this 
channel, may be due to a mismatch between the model used for the data analysis and the one 
used for data generation. It is worth reminding here that the OSE is a black-box simulator 
developed by Astrium GmbH and delivered to ESA under industrial contract. 

Next steps of the data analysis team include the improvement of the data analysis methods 
and focus on the analysis of the different experiments planned during flight operations. In 
parallel, the models implemented in the LTPDA toolbox to describe the LTP experiment are 
being implemented as state-space models, which will allow a more modular design and ease the 
code maintenance and testing. 
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